Sign Language corpus analysis: Synchronisation of linguistic annotation and numerical data

Authors

  • Jérémie Segouat
  • Annelies Braffort
  • Emilie Martin
Abstract

This paper presents a study on the synchronization of linguistic annotation and numerical data on a video corpus of French Sign Language. We detail the methodology and sketch out the observations that this kind of mixed annotation can provide. The corpus is composed of three views: close-up, frontal and top. Image processing has been performed on each video to provide global information on the movement of the signers, namely the size and position of a bounding box surrounding the signer. Linguists have studied this corpus and provided annotations on iconic structures, such as “personal transfers” (role shifts). We used the annotation software ANVIL to synchronize linguistic annotation and numerical data. This new approach to annotation seems promising for the automatic detection of linguistic phenomena, such as classifying signs according to their size in the signing space and detecting some iconic structures. Our first results must be consolidated and extended to the whole corpus. The next step will consist of designing automatic processes to assist SL annotation.
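To make the idea of synchronizing the two data types concrete, here is a minimal sketch of how per-frame bounding-box measurements might be aligned with time-stamped linguistic annotations. It does not read ANVIL files or reproduce the authors' processing. The `Box` and `Annotation` types, the function names, and the assumption that frame `i` is sampled at time `i / fps` are all hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass
from statistics import mean


@dataclass
class Box:
    """Bounding box around the signer in one video frame (hypothetical layout, pixels)."""
    x: float
    y: float
    width: float
    height: float


@dataclass
class Annotation:
    """A linguistic annotation covering the interval [start, end) in seconds."""
    label: str
    start: float
    end: float


def frames_in(annotation, fps, n_frames):
    """Return the indices of frames whose timestamp i / fps lies in [start, end)."""
    first = max(0, math.ceil(annotation.start * fps))
    last = min(n_frames, math.ceil(annotation.end * fps))
    return range(first, last)


def mean_box_area(annotation, boxes, fps):
    """Average bounding-box area over the frames covered by one annotation.

    Returns None when the annotation covers no frame.
    """
    indices = frames_in(annotation, fps, len(boxes))
    if not indices:
        return None
    return mean(boxes[i].width * boxes[i].height for i in indices)
```

A per-annotation summary such as the mean box area could then be compared across labels, for example to check whether segments annotated as personal transfers tend to occupy more of the signing space. Whether such a measure is discriminative is an open question that the sketch does not answer.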

Similar resources

An Annotated Japanese Sign Language Corpus

Sign language is characterized by its interactivity and multimodality, which cause difficulties in data collection and annotation. To address these difficulties, we have developed a video-based Japanese sign language (JSL) corpus and a corpus tool for annotation and linguistic analysis. As the first step of linguistic annotation, we transcribed manual signs expressing lexical information as wel...

Digging into Signs: Emerging Annotation Standards for Sign Language Corpora

This paper describes the creation of annotation standards for glossing sign language corpora as part of the Digging into Signs project (2014-2015, http://www.ru.nl/sign-lang/projects/digging-signs/). This project was based on the annotation of two major sign language corpora, the BSL Corpus (British Sign Language) and the Corpus NGT (Sign Language of the Netherlands). The focus of the gloss ann...

Dealing with Sign Language Morphemes in Statistical Machine Translation

The aim of this research is to establish the role of linguistic information in data-scarce statistical machine translation for sign languages using freely available tools. The main challenge in statistical machine translation is the scarcity of suitable data, and this problem becomes more pronounced in sign languages. The available corpora are small, usually not domain-specific, and their annot...

Ingesting the Auslan Corpus into the DADA Annotation Store

The DADA system is being developed to support collaborative access to and annotation of language resources over the web. DADA implements an abstract model of annotation suitable for storing many kinds of data from a wide range of language resources. This paper describes the process of ingesting data from a corpus of Australian Sign Language (Auslan) into the DADA system. We describe the format ...

A Colloquial Corpus of Japanese Sign Language: Linguistic Resources for Observing Sign Language Conversations

We began building a corpus of Japanese Sign Language (JSL) in April 2011. The purpose of this project was to increase awareness of sign language as a distinctive language in Japan. This corpus is beneficial not only to linguistic research but also to hearing-impaired and deaf individuals, as it helps them to recognize and respect their linguistic differences and communication styles. This is th...

Publication date: 2006